An NHANES application
In typical functional regression tasks, the mean relationship between the outcome, \(Y(s)\), and the covariate vector \(X\) is modeled under a Gaussian process (GP) assumption:
\[ Y_i(s)=\mu_i(s;X)+\epsilon_i(s),\quad\epsilon_i(s) \sim \mathcal{GP}(0, V(\cdot,\cdot)) \] where \(V\) is the covariance operator and does NOT depend on \(X\).
We approach this using mixed-effects modeling with splines: \[ \small \begin{aligned} Y_i(s)&=\overset{\text{(Fixed Effect)}}{\mu_i(s)}&+&\overset{\text{(Random Effect)}}{b_i(s)}&+&\overset{\text{(Residual Noise)}}{\epsilon_i(s)}\\ Y_i(s)&=\sum\limits_{p=1}^PX_{ip}A_p(s)&+&\sum\limits_{j=1}^J\xi_{ij}\phi_j(s)&+&~~~~~~\epsilon_i(s) \end{aligned} \]
When the variance of \(\xi_{ij}\) depends on \(X_i\), so does the variance of \(Y_i(s)\): \[ \begin{aligned} \small Cov[Y_i(s_1),Y_i(s_2)|X_i]&=\sum\limits_{1\le j_1,j_2\le J}Cov(\xi_{ij_1},\xi_{ij_2}|X_i)\phi_{j_1}(s_1)\phi_{j_2}(s_2)\\ &+Var[\epsilon_i(s_1)]\cdot\mathbb{I}[s_1=s_2] \end{aligned} \]
When \(\xi_{ij}\) has nonzero skewness or excess kurtosis, so does \(Y_i(s)\); for the third central moment, \[ \begin{aligned} &E\{[Y_i(s)-EY_i(s)]^3|X_i\}\\=&\sum\limits_{1\le j_1,j_2,j_3\le J}E[\xi_{ij_1}\xi_{ij_2}\xi_{ij_3}|X_i]\phi_{j_1}(s)\phi_{j_2}(s)\phi_{j_3}(s) \end{aligned} \]
We choose to model \(Var[\xi_{ij}]=E[\xi_{ij}^2]\) with a log-linear model: \[ \log E[\xi_{ij}^2]=X_i'\beta_j \] and \(\mathrm{Cor}((\xi_{i1},\ldots,\xi_{iJ})')=C\), which does not depend on \(X\).
Now the covariance operator of \(Y_i(s)\) becomes \[ \tiny \begin{aligned} \Sigma_i(s_1,s_2)&=\sum\limits_{j_1=1}^J\sum\limits_{j_2=1}^J(e^{X_i'\boldsymbol{\beta}_{j_1}})^{1/2}(e^{X_i'\boldsymbol{\beta}_{j_2}})^{1/2}C_{j_1,j_2}\phi_{j_1}(s_1)\phi_{j_2}(s_2)\\ &+\sigma^2_\epsilon(s_1)\mathbb{I}(s_1=s_2) \end{aligned} \] - \(Y_i(s)|X_i\) can be correlated across time points even if \(C\) is diagonal.
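As a sanity check, the induced covariance operator can be assembled directly. A minimal numpy sketch with made-up sine basis functions and coefficients (all values here are illustrative, not from any fitted model):

```python
import numpy as np

rng = np.random.default_rng(0)
J, M, P = 4, 100, 3                               # scores, grid points, covariates
s = np.linspace(0, 1, M)
phi = np.stack([np.sqrt(2) * np.sin((j + 1) * np.pi * s) for j in range(J)])  # (J, M)
beta = rng.normal(scale=0.3, size=(P, J))         # log-linear variance coefficients
C = np.eye(J)                                     # score correlation matrix
sigma2_eps = 0.1 + 0.05 * np.cos(2 * np.pi * s)   # noise variance curve
x = rng.normal(size=P)                            # one subject's covariates

# Sigma_i(s1,s2) = sum_{j1,j2} sd_{j1} sd_{j2} C_{j1,j2} phi_{j1}(s1) phi_{j2}(s2)
#                  + sigma2_eps(s1) * 1[s1 == s2]
sd = np.exp(x @ beta / 2.0)                       # subject-specific score SDs
Sigma = (phi.T * sd) @ C @ (sd[:, None] * phi) + np.diag(sigma2_eps)
```

Even with diagonal \(C\), the off-diagonal entries of \(\Sigma_i\) are nonzero because each basis function \(\phi_j\) spreads across the whole grid.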
One natural way to estimate the fixed effects is spline regression: \(\small A_p(s)=\sum\limits_{l=1}^L\gamma_{pl}B_l(s)\).
\[ Y_i(s_j)\sim\sum\limits_{p,l}\gamma_{pl}[X_{ip}B_l(s_j)] \]
In our case: \(N\approx7000\), \(M=1440\), \(P=13\), \(L=24\), so a joint fit involves roughly \(10^7\) observations and \(P\times L=312\) coefficients.
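For a sense of scale, the stacked design matrix for the joint spline fit has \(NM\) rows and \(PL\) columns; a quick back-of-the-envelope sketch:

```python
# Rough size of the stacked design matrix for a joint spline fit
# (one row per subject-epoch, one column per covariate-basis pair).
N, M, P, L = 7000, 1440, 13, 24
rows, cols = N * M, P * L
dense_gb = rows * cols * 8 / 1e9         # float64, stored densely
print(rows, cols, round(dense_gb, 1))    # 10080000 312 25.2
```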
Solution: Fast Univariate Inference (FUI; Cui et al., 2021)
1-1: Perform a separate OLS regression at each epoch \(s_j\) using the epoch-level data: \[ \color{grey}{Y_i(s_j)=}Y_{ij}\sim\sum\limits_{p}X_{ip}\tilde A_{pj} \]
1-2: Apply any smoothing technique (e.g., LOESS, GAM) to \(\tilde A_{p1},\cdots,\tilde A_{pM}\) to obtain a smooth estimate \(\hat A_p(s)\).
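The two steps above amount to massively parallel pointwise OLS followed by smoothing. A minimal numpy sketch on simulated data (a simple moving average stands in here for the LOESS/GAM smoother):

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, P = 200, 100, 2
s = np.linspace(0, 1, M)
X = np.column_stack([np.ones(N), rng.normal(size=N)])         # intercept + one covariate
A_true = np.stack([np.sin(2 * np.pi * s), 0.5 * np.cos(2 * np.pi * s)])  # (P, M)
Y = X @ A_true + rng.normal(scale=0.3, size=(N, M))

# Step 1-1: one OLS fit per grid point s_j (all M fits in a single lstsq call)
A_tilde, *_ = np.linalg.lstsq(X, Y, rcond=None)               # (P, M) raw coefficient curves

# Step 1-2: smooth each raw coefficient curve along s
k = 9
A_hat = np.stack([np.convolve(a, np.ones(k) / k, mode="same") for a in A_tilde])
```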
\[ \small \begin{aligned} Y_i(s)&=\color{grey}{\sum\limits_{p=1}^PX_{ip}A_p(s)}+{\sum\limits_{j=1}^J\xi_{ij}\phi_j(s)}+{\epsilon_i(s)} \end{aligned} \]
2-1: Regress the residuals \(\small\hat r_i(s)=Y_i(s)-\sum\limits_{p}X_{ip}\hat A_p(s)\) on \(\small\phi_1(s),\cdots,\phi_J(s)\) with a penalty to get score estimates \(\hat\xi_{ij}\). Record the effective degrees of freedom \(edf_i\).
2-2: Fit a GLM with a quasi-Poisson model: \[ \small \color{grey}{Var[\hat\xi_{ij}]\approx}E[\hat\xi_{ij}^2]=\exp(X_i'\beta_j);~~~ Var[\hat\xi_{ij}^2]\propto E[\hat\xi_{ij}^2] \]
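The quasi-Poisson fit of \(\log E[\hat\xi_{ij}^2]=X_i'\beta_j\) solves the same estimating equations as Poisson regression with a log link, so it can be written as a short IRLS loop. A self-contained sketch on simulated scores (dimensions and coefficients are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
N, P = 20000, 3
X = np.column_stack([np.ones(N), rng.normal(size=(N, P - 1))])
beta_true = np.array([0.5, 0.8, -0.4])
xi = rng.normal(scale=np.exp(X @ beta_true / 2))   # scores with log-linear variance
y = xi ** 2                                        # squared scores: E[y | X] = exp(X'beta)

# IRLS for the quasi-Poisson score equations sum_i X_i (y_i - exp(X_i'beta)) = 0
beta = np.zeros(P)
beta[0] = np.log(y.mean())                         # stable starting value
for _ in range(50):
    eta = np.clip(X @ beta, -30, 30)               # guard against overflow early on
    mu = np.exp(eta)
    z = eta + (y - mu) / mu                        # working response
    beta = np.linalg.solve(X.T @ (mu[:, None] * X), X.T @ (mu * z))
```

Only the mean-variance proportionality is assumed; the squared scores here are actually scaled \(\chi^2_1\), and the estimator remains consistent because the score equations stay unbiased.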
2-3: Recover \(C\) from the standardized scores \(\hat\xi_{ij}^*=\hat\xi_{ij}\,(\exp(X_i'\hat\beta_j))^{-1/2}\).
2-4: Recover \(\sigma_\epsilon^2(s)\) by smoothing the squared residuals \[ \hat\epsilon_i(s)^2=\frac{[\hat r_i(s)-\sum_j\hat\xi_{ij}\phi_j(s)]^2}{1-edf_i/M} \] where the denominator corrects for the degrees of freedom used in step 2-1.
We fit Gamma GAMs of \(\hat\epsilon_i^2(s)\) against \(s\): \[ \small\hat\epsilon_i^2(s)\sim Gamma(\mathrm{shape}=\alpha,\mathrm{mean}=\sigma_\epsilon^2(s)) \] To speed up the calculation, we use the following identity: \[ \small E[\ln\hat\epsilon_i^2(s)]=\psi(\alpha)+\ln\frac{\sigma^2_\epsilon(s)}{\alpha},~ Var[\ln\hat\epsilon_i^2(s)]=\psi'(\alpha) \]
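The identity is easy to verify numerically; a quick simulation with an assumed shape \(\alpha=2\) and mean \(\sigma_\epsilon^2=0.4\):

```python
import numpy as np
from scipy.special import digamma, polygamma

rng = np.random.default_rng(3)
alpha, sigma2 = 2.0, 0.4
# Gamma with shape alpha and mean sigma2 (so scale = sigma2 / alpha)
e2 = rng.gamma(alpha, scale=sigma2 / alpha, size=500_000)

mean_theory = digamma(alpha) + np.log(sigma2 / alpha)   # E[log e2]
var_theory = polygamma(1, alpha)                        # Var[log e2] (trigamma)

assert abs(np.log(e2).mean() - mean_theory) < 0.01
assert abs(np.log(e2).var() - var_theory) < 0.01
```

On the log scale the variance no longer depends on \(s\), so the Gamma GAM reduces to a Gaussian smooth of \(\ln\hat\epsilon_i^2(s)\) plus a known intercept shift.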
Note: if we average \(\hat\epsilon_i^2(s)\) at each time \(s\) first and then smooth the averages, we tend to underestimate the standard error of the noise-variance estimator (by about 15%).
2-5: Perform linear regressions for the higher moments: \[ \tiny \begin{aligned} Skew(\hat\xi_{ij})\approx E[\tilde{\xi_{ij}^*}^3]&=X_i'\delta_{\text{skew},j}\\ Kurt(\hat\xi_{ij})\approx E[\tilde{\xi_{ij}^*}^4-3]&=X_i'\delta_{\text{kurt},j} \end{aligned} \] and for other cross-products, such as \(\tilde{\xi_{i1}^*}^2\tilde{\xi_{i2}^*}\).
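With a binary covariate, the third-moment regression simply contrasts group skewness. A small simulated check (chi-square scores in one group, Gaussian in the other; an illustrative design, not the NHANES one):

```python
import numpy as np

rng = np.random.default_rng(4)
N = 100_000
group = rng.binomial(1, 0.5, size=N).astype(bool)
X = np.column_stack([np.ones(N), group])

# Standardized scores: skewness sqrt(2) in group 1 (centered chi-square_4), 0 in group 0
xi_star = rng.normal(size=N)
xi_star[group] = (rng.chisquare(4, size=group.sum()) - 4) / np.sqrt(8)

# 2-5: linear regression of the cubed standardized scores on X
delta_skew, *_ = np.linalg.lstsq(X, xi_star ** 3, rcond=None)
# delta_skew[0] estimates group-0 skewness (~0);
# delta_skew[0] + delta_skew[1] estimates group-1 skewness (~sqrt(2))
```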
2-6: Recover the skewness/kurtosis of \(Y_i(s)|X_i\) from the 2nd-4th moments of the scores.
Standard errors are estimated via nonparametric bootstrap, i.e., resampling individuals with replacement.
Pointwise Wald intervals: \(\pm 1.96\times SE\)
Correlation and Multiplicity Adjusted (CMA) intervals: \(\pm Q\times SE\)
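The CMA multiplier \(Q\) can be taken as the 95% quantile of the supremum of standardized absolute deviations across the grid, computed from the same bootstrap draws. A toy sketch for a mean curve:

```python
import numpy as np

rng = np.random.default_rng(5)
N, M, B = 300, 50, 500
s = np.linspace(0, 1, M)
Y = np.sin(2 * np.pi * s) + rng.normal(scale=0.5, size=(N, M))

est = Y.mean(axis=0)                      # pointwise estimator (toy: the mean curve)
boot = np.empty((B, M))
for b in range(B):                        # resample individuals with replacement
    boot[b] = Y[rng.integers(0, N, size=N)].mean(axis=0)
se = boot.std(axis=0, ddof=1)

# Q: 95% quantile of the sup of standardized absolute deviations over the grid
Q = np.quantile(np.abs((boot - est) / se).max(axis=1), 0.95)
# Pointwise Wald band: est +/- 1.96 * se;  CMA band: est +/- Q * se (wider)
```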
Should we use symmetric CI when we no longer assume symmetry in the data?
[Figures: NHANES estimates of the fixed-effect curves, random-effect moments, and residual noise variance.]
Work in Progress
\[ \tiny \begin{aligned} X_1&=1,X_2\sim Uniform(-30,30),\\ X_3&\sim Bernoulli(0.5),X_4\sim Bernoulli(0.3)\\ A(s)&=(A_1(s),A_2(s),A_3(s),A_4(s))\text{ where }\\ A_1(s)&=2.5-1.8\exp\{2(1-\cos\frac{(s-200)\pi}{720})\}\\ A_2(s)&=\begin{cases} 0.005-0.01(1-\cos\frac{(s-480)\pi}{960})&480\le s\le 1440\\ 0.005-0.01(1-\cos\frac{(s-480)\pi}{480})&0\le s< 480 \end{cases}\\ A_3(s)&=\begin{cases} 0.1-0.125(1-\cos\frac{(s+240)\pi}{540})&0\le s<300\\ 0.1-0.125(1-\cos\frac{(s-600)\pi}{300})&300\le s<600\\ 0.1&600\le s<1200\\ 0.1-0.125(1-\cos\frac{(s-1200)\pi}{1740})&1200\le s<1440 \end{cases}\\ A_4(s)&=\begin{cases} 0.02-0.125(1-\cos\frac{(s-180)\pi}{1200})&0\le s<180\\ 0.02-0.125(1-\cos\frac{(s-180)\pi}{240})&180\le s<420\\ -0.23+0.125(1-\cos\frac{(s-420)\pi}{1200})&420\le s<1440 \end{cases}\\ \sigma_\epsilon^2(s)&=0.1+0.35\exp[-4(1-\cos\frac{(s-360)\pi}{720})]+0.25\exp[-4(1-\cos\frac{s\pi}{720})] \end{aligned} \]
Sample \(\tiny \gamma\sim MVN(0,\Sigma)\) where \[\tiny\Sigma=\begin{pmatrix} 1& -0.4& 0.7& -0.4& -0.2 \\ -0.4& 1& -0.4& 0.5& -0.1 \\ 0.7& -0.4& 1& -0.5& 0.1 \\ -0.4& 0.5& -0.5& 1& -0.5\\ -0.2& -0.1& 0.1& -0.5& 1\\ \end{pmatrix}\]
Marginally transform \(\gamma_1,\gamma_2,\gamma_3,\gamma_5\) to Gamma distributions with shape parameters 3, 50, 20, and 4, then standardize to mean 0 and variance 1
Flip the sign of \(\gamma_1,\gamma_3\)
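One way to implement this generator is a Gaussian-copula transform: sample the correlated normals, push the selected margins through the Gamma quantile function, standardize, then flip signs. A sketch (the copula construction is an assumption; the slides only specify the marginals and \(\Sigma\)):

```python
import numpy as np
from scipy.stats import norm, gamma

rng = np.random.default_rng(6)
Sigma = np.array([[ 1.0, -0.4,  0.7, -0.4, -0.2],
                  [-0.4,  1.0, -0.4,  0.5, -0.1],
                  [ 0.7, -0.4,  1.0, -0.5,  0.1],
                  [-0.4,  0.5, -0.5,  1.0, -0.5],
                  [-0.2, -0.1,  0.1, -0.5,  1.0]])
N = 100_000
g = rng.multivariate_normal(np.zeros(5), Sigma, size=N)

shapes = {0: 3.0, 1: 50.0, 2: 20.0, 4: 4.0}       # gamma_4 (index 3) stays Gaussian
for j, a in shapes.items():
    u = norm.cdf(g[:, j])                          # to uniforms...
    x = gamma.ppf(u, a)                            # ...then to Gamma(shape = a)
    g[:, j] = (x - a) / np.sqrt(a)                 # standardize to mean 0, variance 1
g[:, [0, 2]] *= -1                                 # flip the sign of gamma_1, gamma_3
```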
| Coverage % (Wald; CMA) | N=100 | N=300 | N=1000 |
|---|---|---|---|
| Fixed effect \(A_2(s)\) | | | |
| freq = 1 min | 93; 88 | 94; 87 | 96; 94 |
| freq = 3 min | 93; 86 | 92; 87 | 94; 94 |
| freq = 10 min | 95; 94 | 95; 92 | 95; 92 |
| Noise variance | | | |
| freq = 1 min | 95; 89 | 95; 96 | 95; 91 |
| freq = 3 min | 96; 92 | 95; 93 | 95; 93 |
| freq = 10 min | 95; 88 | 94; 91 | 94; 93 |
Cui, E., Leroux, A., Smirnova, E., & Crainiceanu, C. M. (2021). Fast univariate inference for longitudinal functional models.
Crainiceanu, C. M., Goldsmith, J., Leroux, A., & Cui, E. (2024). Functional Data Analysis with R.
Backenroth, D., Goldsmith, J., Harran, M. D., Cortes, J. C., Krakauer, J. W., & Kitago, T. (2018). Modeling motor learning using heteroscedastic functional principal components analysis.
Reiss, P. T., Huang, L., & Mennes, M. (2010). Fast function-on-scalar regression with penalized basis expansions.
Zhao, Y., Caffo, B. S., & Luo, X., for the Alzheimer's Disease Neuroimaging Initiative. Longitudinal regression of covariance matrix outcomes.